NSF PAR Search | NSF Public Access Repository

SBO-RNN: Reformulating Recurrent Neural Networks via Stochastic Bilevel Optimization

Zhang, Z.; Yue, Y.; Wu, G.; Li, Y.; Zhang, H. (December 2021, Thirty-fifth Conference on Neural Information Processing Systems)

In this paper we consider the training stability of recurrent neural networks (RNNs) and propose a family of RNNs, namely SBO-RNN, that can be formulated using stochastic bilevel optimization (SBO). With the help of stochastic gradient descent (SGD), we manage to convert the SBO problem into an RNN where the feedforward and backpropagation solve the lower and upper-level optimization for learning hidden states and their hyperparameters, respectively. We prove that under mild conditions there is no vanishing or exploding gradient in training SBO-RNN. Empirically we demonstrate our approach with superior performance on several benchmark datasets, with fewer parameters, less training data, and much faster convergence. Code is available at https://zhang-vislab.github.io.
Full Text Available
An early transition to magnetic supercriticality in star formation

https://doi.org/10.1038/s41586-021-04159-x

Ching, T.-C.; Li, D.; Heiles, C.; Li, Z.-Y.; Qian, L.; Yue, Y. L.; Tang, J.; Jiao, S. H. (January 2022, Nature)

Abstract Magnetic fields have an important role in the evolution of interstellar medium and star formation 1,2 . As the only direct probe of interstellar field strength, credible Zeeman measurements remain sparse owing to the lack of suitable Zeeman probes, particularly for cold, molecular gas 3 . Here we report the detection of a magnetic field of +3.8 ± 0.3 microgauss through the H I narrow self-absorption (HINSA) 4,5 towards L1544 6,7 —a well-studied prototypical prestellar core in an early transition between starless and protostellar phases 8–10 characterized by a high central number density 11 and a low central temperature 12 . A combined analysis of the Zeeman measurements of quasar H I absorption, H I emission, OH emission and HINSA reveals a coherent magnetic field from the atomic cold neutral medium (CNM) to the molecular envelope. The molecular envelope traced by the HINSA is found to be magnetically supercritical, with a field strength comparable to that of the surrounding diffuse, magnetically subcritical CNM despite a large increase in density. The reduction of the magnetic flux relative to the mass, which is necessary for star formation, thus seems to have already happened during the transition from the diffuse CNM to the molecular gas traced by the HINSA. This is earlier than envisioned in the classical picture where magnetically supercritical cores capable of collapsing into stars form out of magnetically subcritical envelopes 13,14 .
Full Text Available
Safety-Aware Preference-Based Learning for Safety-Critical Control

Cosner, R.; Tucker, M.; Taylor, A.; Li, K.; Molnar, T.; Ubelacker, W.; Alan, A.; Orosz, G.; Yue, Y.; Ames, A. (January 2022, 4th Annual Learning for Dynamics and Control Conference, PMLR)

Full Text Available
Learning to make decisions via submodular regularization

Alieva, A.; Aceves, A.; Song, J.; Mayo, S.; Yue, Y.; Chen, Y. (October 2020, International Conference on Learning Representations)

Many sequential decision making tasks can be viewed as combinatorial optimization problems over a large number of actions. When the cost of evaluating an action is high, even a greedy algorithm, which iteratively picks the best action given the history, is prohibitive to run. In this paper, we aim to learn a greedy heuristic for sequentially selecting actions as a surrogate for invoking the expensive oracle when evaluating an action. In particular, we focus on a class of combinatorial problems that can be solved via submodular maximization (either directly on the objective function or via submodular surrogates). We introduce a data-driven optimization framework based on the submodular-norm loss, a novel loss function that encourages the resulting objective to exhibit diminishing returns. Our framework outputs a surrogate objective that is efficient to train, approximately submodular, and can be made permutation-invariant. The latter two properties allow us to prove strong approximation guarantees for the learned greedy heuristic. Furthermore, our model is easily integrated with modern deep imitation learning pipelines for sequential prediction tasks. We demonstrate the performance of our algorithm on a variety of batched and sequential optimization tasks, including set cover, active learning, and data-driven protein engineering.
Full Text Available
Co-training for Policy Learning

Song, JL; Lanka, RA; Yue, Y; Ono, MSRO (January 2020, 35th Uncertainty in Artificial Intelligence Conference (UAI 2019))
Adams, RP; Gogate, V (Eds.)
We study the problem of learning sequential decision-making policies in settings with multiple state-action representations. Such settings naturally arise in many domains, such as planning (e.g., multiple integer programming formulations) and various combinatorial optimization problems (e.g., those with both integer programming and graph-based formulations). Inspired by the classical co-training framework for classification, we study the problem of co-training for policy learning. We present sufficient conditions under which learning from two views can improve upon learning from a single view alone. Motivated by these theoretical insights, we present a meta-algorithm for co-training for sequential decision making. Our framework is compatible with both reinforcement learning and imitation learning. We validate the effectiveness of our approach across a wide range of tasks, including discrete/continuous control and combinatorial optimization.
Full Text Available
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis

Park, J.Y.; Carr, K. T.; Zheng, S.; Yue, Y.; Yu, R. (January 2020, International Conference on Machine Learning)

Full Text Available
A repeating fast radio burst associated with a persistent radio source

https://doi.org/10.1038/s41586-022-04755-5

Niu, C.-H.; Aggarwal, K.; Li, D.; Zhang, X.; Chatterjee, S.; Tsai, C.-W.; Yu, W.; Law, C. J.; Burke-Spolaor, S.; Cordes, J. M.; et al (June 2022, Nature)

Abstract The dispersive sweep of fast radio bursts (FRBs) has been used to probe the ionized baryon content of the intergalactic medium 1 , which is assumed to dominate the total extragalactic dispersion. Although the host-galaxy contributions to the dispersion measure appear to be small for most FRBs 2 , in at least one case there is evidence for an extreme magneto-ionic local environment 3,4 and a compact persistent radio source 5 . Here we report the detection and localization of the repeating FRB 20190520B, which is co-located with a compact, persistent radio source and associated with a dwarf host galaxy of high specific-star-formation rate at a redshift of 0.241 ± 0.001. The estimated host-galaxy dispersion measure of approximately $903_{-111}^{+72}$ parsecs per cubic centimetre, which is nearly an order of magnitude higher than the average of FRB host galaxies 2,6 , far exceeds the dispersion-measure contribution of the intergalactic medium. Caution is thus warranted in inferring redshifts for FRBs without accurate host-galaxy identifications.
Full Text Available
A fast radio burst source at a complex magnetized site in a barred galaxy

https://doi.org/10.1038/s41586-022-05071-8

Xu, H; Niu, J R; Chen, P; Lee, K J; Zhu, W W; Dong, S; Zhang, B; Jiang, J C; Wang, B J; Xu, J W; et al (September 2022, Nature)

Full Text Available
A bimodal burst energy distribution of a repeating fast radio burst source

https://doi.org/10.1038/s41586-021-03878-5

Li, D.; Wang, P.; Zhu, W. W.; Zhang, B.; Zhang, X. X.; Duan, R.; Zhang, Y. K.; Feng, Y.; Tang, N. Y.; Chatterjee, S.; et al (October 2021, Nature)

Full Text Available
STCF conceptual design report (Volume 1): Physics & detector

https://doi.org/10.1007/s11467-023-1333-z

Achasov, M.; Ai, X. C.; An, L. P.; Aliberti, R.; An, Q.; Bai, X. Z.; Bai, Y.; Bakina, O.; Barnyakov, A.; Blinov, V.; et al (February 2024, Frontiers of Physics)

Abstract The super τ-charm facility (STCF) is an electron–positron collider proposed by the Chinese particle physics community. It is designed to operate in a center-of-mass energy range from 2 to 7 GeV with a peak luminosity of 0.5 × 10³⁵ cm⁻²·s⁻¹ or higher. The STCF will produce a data sample about a factor of 100 larger than that of the present τ-charm factory, the BEPCII, providing a unique platform for exploring the asymmetry of matter-antimatter (charge-parity violation), in-depth studies of the internal structure of hadrons and the nature of non-perturbative strong interactions, as well as searching for exotic hadrons and physics beyond the Standard Model. The STCF project in China is under development with an extensive R&D program. This document presents the physics opportunities at the STCF, describes conceptual designs of the STCF detector system, and discusses future plans for detector R&D and physics case studies.
Full Text Available
